I have provided you with data about mortality from all 50 states and the District of Columbia. Please access it at https://github.com/charleyferrari/CUNY_DATA608/tree/master/module3/data. You are invited to gather more data from our provider, the CDC WONDER system, at https://wonder.cdc.gov/ucd-icd10.html.
This assignment must be done in R. It must be done using the ‘shiny’ package.
It is recommended that you use an R package that supports interactive graphing, such as plotly or vegalite, but this is not required.
Your apps must be deployed; I won’t be accepting raw files. Luckily, you can pretty easily deploy apps with a free account at shinyapps.io.
library("knitr")
library("rmarkdown")
knitr::opts_chunk$set(comment = NA)
library(tidyverse)
library(dplyr)
library(tidyr)
library(tibble)
library(reshape2)
library(stringr)
library(plotly)
library(shiny)
library(rsconnect)
df <-
read.csv('https://raw.githubusercontent.com/charleyferrari/CUNY_DATA608/master/lecture3/data/cleaned-cdc-mortality-1999-2010-2.csv', header= TRUE, stringsAsFactors=TRUE)
str(df)
'data.frame': 9961 obs. of 6 variables:
$ ICD.Chapter: Factor w/ 19 levels "Certain conditions originating in the perinatal period",..: 2 2 2 2 2 2 2 2 2 2 ...
$ State : Factor w/ 51 levels "AK","AL","AR",..: 2 2 2 2 2 2 2 2 2 2 ...
$ Year : int 1999 2000 2001 2002 2003 2004 2005 2006 2007 2008 ...
$ Deaths : int 1092 1188 1211 1215 1350 1251 1303 1312 1241 1385 ...
$ Population : int 4430141 4447100 4467634 4480089 4503491 4530729 4569805 4628981 4672840 4718206 ...
$ Crude.Rate : num 24.6 26.7 27.1 27.1 30 27.6 28.5 28.3 26.6 29.4 ...
Cause-based 2010 State Crude Mortality Rates
As a researcher, you frequently compare mortality rates from particular causes across different States. You need a visualization that will let you see (for 2010 only) the crude mortality rate, across all States, from one cause (for example, Neoplasms, which are effectively cancers). Create a visualization that allows you to rank States by crude mortality for each cause of death.
Filter by year and cause of death (COD), then display the crude rate for every State. The bars are ordered by State rather than by crude rate because, with 51 entries, a particular State is hard to locate in a rate-sorted chart.
cod <- "Neoplasms"
pp1 <- df %>%
  filter(Year == 2010, ICD.Chapter == cod) %>%  # Filter by year and COD (Year is an integer column)
  arrange(desc(State), Crude.Rate)              # Descending State order so the chart reads A to Z from the top
head(pp1)
ICD.Chapter State Year Deaths Population Crude.Rate
1 Neoplasms WY 2010 1053 563626 186.8
2 Neoplasms WV 2010 4797 1852994 258.9
3 Neoplasms WI 2010 11644 5686986 204.7
4 Neoplasms WA 2010 12140 6724540 180.5
5 Neoplasms VT 2010 1431 625741 228.7
6 Neoplasms VA 2010 14425 8001024 180.3
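chart1 <- pp1 %>%
  plot_ly(x = ~Crude.Rate, y = ~State, type = "bar", orientation = "h") %>%
  layout(
    title = list(text = paste0(cod, "\ncause-based 2010 State Crude Mortality Rates"),
                 font = list(size = 10)),
    xaxis = list(title = "Crude Rate"),
    yaxis = list(title = "States",
                 categoryorder = "array",
                 # plotly draws the first category at the bottom of the y axis,
                 # so the descending order from arrange() puts "AK" at the top
                 categoryarray = as.character(pp1$State))
  )
chart1  # print the plot directly; subplot() drops axis titles by default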
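The chart above orders the bars alphabetically by State. Since the task also asks for a way to rank States, one option is to sort by Crude.Rate before plotting. What follows is a minimal sketch in base R, not the method used in this report: the `toy` data frame reuses the six 2010 Neoplasms rows printed by head(pp1), and the names `toy` and `ranked` are illustrative only.

```r
# Illustrative subset: the six 2010 Neoplasms rows shown in the head(pp1) output
toy <- data.frame(
  State      = c("WY", "WV", "WI", "WA", "VT", "VA"),
  Crude.Rate = c(186.8, 258.9, 204.7, 180.5, 228.7, 180.3),
  stringsAsFactors = FALSE
)

# Sort from the highest to the lowest crude rate and attach an explicit rank
ranked <- toy[order(-toy$Crude.Rate), ]
ranked$Rank <- seq_len(nrow(ranked))
rownames(ranked) <- NULL

print(ranked)
```

Passing `rev(ranked$State)` as `categoryarray` should then place the highest rate at the top of a horizontal bar chart, because plotly draws the first category at the bottom of the y axis.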
sidebarPanel(
  selectInput("cod", label = "Cause of Death:",
              choices = levels(df$ICD.Chapter))  # one entry per cause, not one per data row
)
mainPanel(
  plotlyOutput("chart1")
)
# plotly objects need renderPlotly(), not renderPlot(); the output ID must match plotlyOutput()
output$chart1 <- renderPlotly({
  pp1 <- df %>%
    filter(Year == 2010, ICD.Chapter == input$cod) %>%  # Filter by year and COD
    arrange(desc(State), Crude.Rate)                    # Descending State order so the chart reads A to Z from the top
  pp1 %>%
    plot_ly(x = ~Crude.Rate, y = ~State, type = "bar", orientation = "h") %>%
    layout(
      title = list(text = paste0(input$cod, "\ncause-based 2010 State Crude Mortality Rates"),
                   font = list(size = 10)),
      xaxis = list(title = "Crude Rate"),
      yaxis = list(title = "States",
                   categoryorder = "array",
                   categoryarray = as.character(pp1$State))
    )
})
Comparing Crude Death Rates by Cause and Year
Often you are asked whether particular States are improving their mortality rates (per cause) faster than, or slower than, the national average. Create a visualization that lets your clients see this for themselves for one cause of death at a time. Keep in mind that the national average should be weighted by the national population.
cod <- "Certain infectious and parasitic diseases"
pp2 <- df %>%
  filter(ICD.Chapter == cod) %>%   # Filter by cause of death
  group_by(Year, ICD.Chapter) %>%  # One group per year
  # National crude rate per 100,000 for each year, pooled over all States, per CDC:
  # https://wonder.cdc.gov/wonder/help/ucd.html#Rates
  # Rounded to a whole number; round(x, 1) would keep one decimal place.
  mutate(US.Crude.Rate = round(sum(Deaths) / sum(Population) * 100000)) %>%
  group_by(Year, ICD.Chapter, State)
head(pp2)
# A tibble: 6 x 7
# Groups:   Year, ICD.Chapter, State [6]
  ICD.Chapter       State  Year Deaths Population Crude.Rate US.Crude.Rate
  <fct>             <fct> <int>  <int>      <int>      <dbl>         <dbl>
1 Certain infectio~ AL     1999   1092    4430141       24.6            21
2 Certain infectio~ AL     2000   1188    4447100       26.7            21
3 Certain infectio~ AL     2001   1211    4467634       27.1            21
4 Certain infectio~ AL     2002   1215    4480089       27.1            22
5 Certain infectio~ AL     2003   1350    4503491       30              22
6 Certain infectio~ AL     2004   1251    4530729       27.6            22
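The national rate above pools deaths and population across States before dividing, which is what makes it population-weighted. The sketch below illustrates why this differs from simply averaging the State rates. It uses a hypothetical two-state data frame with invented numbers; `toy`, `unweighted`, `pooled`, and `weighted` are illustrative names, not part of the assignment data.

```r
# Hypothetical two-state example (numbers invented for illustration)
toy <- data.frame(
  State      = c("A", "B"),
  Deaths     = c(100, 900),
  Population = c(1e6, 3e6)
)
toy$Crude.Rate <- toy$Deaths / toy$Population * 1e5  # deaths per 100,000

# Plain mean of the State rates: treats both States as equally large
unweighted <- mean(toy$Crude.Rate)

# Pooled rate: total deaths over total population
pooled <- sum(toy$Deaths) / sum(toy$Population) * 1e5

# Same quantity, written as a population-weighted mean of the State rates
weighted <- weighted.mean(toy$Crude.Rate, w = toy$Population)

print(c(unweighted = unweighted, pooled = pooled, weighted = weighted))
```

Here the State rates are 10 and 30 per 100,000, so the plain mean is 20, while the pooled and weighted versions both give 25 because State B has three times the population.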
state <- "AL"
chart2 <- pp2 %>%
  ungroup() %>%                     # Drop the grouping before selecting columns
  filter(State == state) %>%
  select(Year, Crude.Rate, US.Crude.Rate) %>%
  plot_ly(x = ~Year, y = ~Crude.Rate, type = 'bar',
          text = ~Crude.Rate, textposition = 'auto',
          marker = list(color = 'rgb(158,202,225)'),
          name = 'State') %>%       # Chart State and US crude rates next to one another
  add_trace(x = ~Year, y = ~US.Crude.Rate, type = 'bar',  # https://plotly.com/chart-studio-help/documentation/r/bar-charts/
            text = ~US.Crude.Rate, textposition = 'auto',
            marker = list(color = 'rgb(58,200,225)'), name = 'US') %>%
  layout(
    title = list(text = paste0(cod, "\ncause-based State and US Crude Death Rates by Year\nfor ", state),
                 font = list(size = 10)),
    barmode = 'group',
    xaxis = list(title = "Year"),
    yaxis = list(title = "Crude Death Rate"))
chart2  # print the plot directly; subplot() drops axis titles by default
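The grouped bars show the two rates side by side, but the question of "faster or slower than the national average" is about change over time. One possible way to make that explicit is to express each series as a percent change from its first year. This is a sketch under assumptions, not part of the original analysis: `toy` copies the six Alabama rows from the head(pp2) output (where the US rate is rounded to whole numbers, so its change is approximate), and `pct_change` is a hypothetical helper.

```r
# Illustrative data: Alabama and US rates for 1999-2004 from the head(pp2) output
toy <- data.frame(
  Year          = 1999:2004,
  Crude.Rate    = c(24.6, 26.7, 27.1, 27.1, 30.0, 27.6),
  US.Crude.Rate = c(21, 21, 21, 22, 22, 22)
)

# Hypothetical helper: percent change relative to the first value in the series
pct_change <- function(x) (x / x[1] - 1) * 100

toy$State.Change <- round(pct_change(toy$Crude.Rate), 1)
toy$US.Change    <- round(pct_change(toy$US.Crude.Rate), 1)

# Positive gap: the State rate rose more (or fell less) than the national rate
toy$Gap <- toy$State.Change - toy$US.Change

print(toy)
```

A negative gap would indicate that the State is improving faster than the nation, since lower mortality is the better outcome.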
sidebarPanel(
  selectInput("state", "State:",
              choices = levels(df$State)),        # one entry per State, not one per data row
  selectInput("cod2", "Cause of Death:",          # distinct ID: "cod" is already used by the first chart
              choices = levels(df$ICD.Chapter))
)
mainPanel(
  plotlyOutput("chart2")
)
# plotly objects need renderPlotly(), not renderPlot(); the output ID must match plotlyOutput()
output$chart2 <- renderPlotly({
  pp2 <- df %>%
    filter(ICD.Chapter == input$cod2) %>%
    group_by(Year, ICD.Chapter) %>%
    # National crude rate per 100,000 for each year, rounded to a whole number
    mutate(US.Crude.Rate = round(sum(Deaths) / sum(Population) * 100000)) %>%
    ungroup()
  pp2 %>%
    filter(State == input$state) %>%
    select(Year, Crude.Rate, US.Crude.Rate) %>%
    plot_ly(x = ~Year, y = ~Crude.Rate, type = 'bar',
            text = ~Crude.Rate, textposition = 'auto',
            marker = list(color = 'rgb(158,202,225)'),
            name = 'State') %>%
    add_trace(x = ~Year, y = ~US.Crude.Rate, type = 'bar',
              text = ~US.Crude.Rate, textposition = 'auto',
              marker = list(color = 'rgb(58,200,225)'), name = 'US') %>%
    layout(
      title = list(text = paste0(input$cod2, "\ncause-based State and US Crude Death Rates by Year\nfor ", input$state),
                   font = list(size = 10)),
      barmode = 'group',
      xaxis = list(title = "Year"),
      yaxis = list(title = "Crude Death Rate"))
})
I am still working on the display of Shiny output in RStudio R Markdown files. See:
https://bookdown.org/yihui/rmarkdown/shiny-widgets.html#the-shinyapp-function
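As a possible starting point for that work, the sketch below bundles a selector and a plotly chart into a single app object with shinyApp(), the function the link above describes. It is a minimal sketch under assumptions: `toy` is hypothetical stand-in data with invented values (the real app would use df), and the IDs `cod` and `rate_chart` are illustrative.

```r
library(shiny)
library(plotly)

# Hypothetical stand-in data; the real app would use the CDC data frame df
toy <- data.frame(
  ICD.Chapter = rep(c("Cause X", "Cause Y"), each = 3),
  State       = rep(c("AL", "AK", "AZ"), times = 2),
  Crude.Rate  = c(10, 20, 30, 5, 15, 25),
  stringsAsFactors = FALSE
)

# UI: one selector in the sidebar, one plotly output in the main panel
ui <- fluidPage(
  sidebarLayout(
    sidebarPanel(
      selectInput("cod", "Cause of Death:", choices = unique(toy$ICD.Chapter))
    ),
    mainPanel(plotlyOutput("rate_chart"))
  )
)

# Server: filter to the selected cause and draw a horizontal bar chart
server <- function(input, output) {
  output$rate_chart <- renderPlotly({
    shown <- toy[toy$ICD.Chapter == input$cod, ]
    plot_ly(shown, x = ~Crude.Rate, y = ~State, type = "bar", orientation = "h")
  })
}

# Build the app object; printing it or calling runApp(app) starts the app
app <- shinyApp(ui = ui, server = server)
```

The same ui and server, saved in an app.R file, are what would be deployed to shinyapps.io, for example with rsconnect::deployApp().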